Simulation Studies in Optimistic Bayesian Sampling in Contextual-Bandit Problems
Abstract
This technical report accompanies the article “Optimistic Bayesian Sampling in Contextual-Bandit Problems” by B.C. May, N. Korda, A. Lee, and D.S. Leslie [3].
Similar Resources
Optimistic Bayesian Sampling in Contextual-Bandit Problems
In sequential decision problems in an unknown environment, the decision maker often faces a dilemma over whether to explore to discover more about the environment, or to exploit current knowledge. We address the exploration-exploitation dilemma in a general setting encompassing both standard and contextualised bandit problems. The contextual bandit problem has recently resurfaced in attempts to...
Optimistic Bayesian Sampling in Contextual-Bandit Problems
In every sequential decision problem in an unknown environment, the decision maker faces a dilemma over whether to explore to discover more about the environment, or to exploit current knowledge. We address the exploration/exploitation dilemma in a general setting encompassing both standard and contextualised bandit problems. In this article we extend an approach of Thompson [13] which makes us...
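The approach of Thompson [13] that the abstract above extends can be sketched as posterior sampling: draw one sample from each arm's posterior and pull the arm whose sample is largest. A minimal Beta-Bernoulli sketch follows; the arm success probabilities and pull count are illustrative assumptions, not values from the paper.

```python
import random

def thompson_sampling(arms, pulls=1000, seed=0):
    """Beta-Bernoulli Thompson sampling: sample a success
    probability from each arm's Beta posterior and pull the
    arm whose sample is largest."""
    rng = random.Random(seed)
    k = len(arms)
    alpha = [1] * k  # successes + 1 (uniform Beta(1,1) prior)
    beta = [1] * k   # failures + 1
    for _ in range(pulls):
        # One posterior draw per arm; act greedily on the draws.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        i = max(range(k), key=lambda j: samples[j])
        reward = 1 if rng.random() < arms[i] else 0
        alpha[i] += reward
        beta[i] += 1 - reward
    return alpha, beta
```

Randomising over posterior draws, rather than acting on the posterior mean, is what resolves the exploration/exploitation dilemma here: poorly sampled arms retain wide posteriors and so keep a chance of being selected.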
Thompson Sampling for Contextual Bandits with Linear Payoffs
Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several studies demonstrated it to have better empirical performance compared to the state-of-the-art methods. However, many questions regarding its theoretical performance remained open. In this paper, we d...
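In the linear-payoff setting sketched above, the expected reward of an arm is a linear function of its feature vector, and Thompson sampling draws the unknown parameter from a Gaussian posterior. A hedged one-dimensional illustration follows; the features, true parameter, noise level, and posterior-scale constant `v` are illustrative assumptions, not values from the paper.

```python
import random

def linear_thompson(features, theta_true, rounds=500, noise=0.1, v=0.5, seed=1):
    """Thompson sampling for linear payoffs in one dimension:
    the reward of arm a is theta * x_a + Gaussian noise, and a
    Gaussian posterior is maintained over the unknown theta."""
    rng = random.Random(seed)
    B = 1.0   # posterior precision (unit-variance Gaussian prior)
    f = 0.0   # running sum of x * reward
    for _ in range(rounds):
        # Sample theta from the posterior N(f/B, v^2 / B).
        theta_hat = rng.gauss(f / B, v / B ** 0.5)
        # Pull the arm that looks best under the sampled theta.
        a = max(range(len(features)), key=lambda i: theta_hat * features[i])
        x = features[a]
        r = theta_true * x + rng.gauss(0.0, noise)
        B += x * x
        f += x * r
    return f / B  # posterior mean estimate of theta
```

With enough rounds the posterior mean concentrates around the true parameter, and the sampled `theta_hat` increasingly selects the genuinely best arm.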
A Practical Method for Solving Contextual Bandit Problems Using Decision Trees
Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build appropriate features and to tune their parameters. We propose a new method for the contextual bandit problem that is simple, practical, and can be applied with...
Sequential Monte Carlo Bandits
In this paper we propose a flexible and efficient framework for handling multi-armed bandits, combining sequential Monte Carlo algorithms with hierarchical Bayesian modeling techniques. The framework naturally encompasses restless bandits, contextual bandits, and other bandit variants under a single inferential model. Despite the model’s generality, we propose efficient Monte Carlo algorithms t...